Article 




Research for Young Children With 
Autism Spectrum Disorders: Evidence 
of Social and Ecological Validity 


Topics in Early Childhood Special Education 
2016, Vol. 35(4) 223-233 
© Hammill Institute on Disabilities 2015 
Reprints and permissions: 
sagepub.com/journalsPermissions.nav 
DOI: 10.1177/0271121415585956 
tecse.sagepub.com 

SAGE 


Jennifer R. Ledford, PhD¹, Emilie Hall, BA¹, Emily Conder, BA¹, 
and Justin D. Lane, PhD² 


Abstract 

The social and ecological validity of a body of research may impact the degree to which interventions will be used 
outside of research contexts. The purpose of this review was to determine the extent to which social and ecological 
validity were demonstrated for interventions designed to increase social skills for young children with autism spectrum 
disorders (ASD). Results indicated that although the percentage of studies including social validity assessment has remained 
stable over the 20-year review period, subjective assessments of social validity have increased and objective assessments 
have decreased. Acceptability was measured more often than feasibility or importance. Approximately half of the studies 
included indigenous implementers, typical social partners, or typical settings. Suggestions include additional research on 
the validity of measures, explicit reporting by researchers, and the use of multiple, objective, and psychometrically sound 
social validity assessments. 

Single case research has contributed to the development of a number of interventions for young
children with disabilities, many of which are behavioral in nature (Odom & Strain, 2002). Beginning
nearly 40 years ago, interventionists and researchers using these interventions began arguing that
evidence of effectiveness was not sufficient; evidence of relevance and practical significance was
also needed (Brooks & Baumeister, 1977; Wolf, 1978). These characteristics can be considered
components of social and ecological validity. Wolf conceptualized three separate components of
social validity: (a) significance of goals, (b) appropriateness of procedures, and (c) importance of
effects. Ecological validity is a closely related construct: It refers to the likelihood that
outcomes from a given study are meaningful outside the research context and might be referred to as
the feasibility of implementation of an intervention (Brooks & Baumeister, 1977; Gast, 2014). Some
researchers consider feasibility to be a component of social validity (cf. Machalicek, O'Reilly,
Beretvas, Sigafoos, & Lancioni, 2007). Others have referred to studies having ecological validity
when they are implemented in typical contexts (Clarke & Dunlap, 2008) or have suggested that the
use of typical implementers and contexts results in improved social validity (Horner et al., 2005).
Thus, ecological and social validity are associated constructs that are often highly valued in the
fields of early childhood special education and early intervention (ECSE/EI).


The perceived importance of social and ecological validity is due in part to the idea that
acceptability, feasibility, and significance of a study may impact the “scaling up” of the related
intervention for use in typical contexts. Contemporary researchers have argued that attempts to
control environmental variability in educational research (e.g., minimizing the effects of
implementer skill by including only highly trained implementers) may result in less variable
outcomes and increased chance for experimental control; it might also considerably constrain the
extent to which findings are generalizable (cf. Phillips, 2014). Thus, researchers in ECSE/EI may
opt for a greater likelihood of demonstrating experimental control, but at the cost of decreased
ecological or social validity. This loss of social and ecological validity is critical because
there may be a positive correlation between social validity and the extent to which interventions
are used (cf. Carter & Pesko, 2008). Thus, when determining whether a practice is evidence based in
ECSE/EI, it may be critical that the evidence for social and ecological validity be considered
(e.g., Can parents and practitioners reliably implement the intervention? Will they do so?).

Table 1. Descriptions of Potential Evidence for Social and Ecological Validity.

Component: Description

Feasibility
  Typical settings: Study is conducted in the student’s home, school, or community-based settings
  Typical activities: Study is conducted during activities that naturally occur
  Indigenous implementers: Study is conducted by adults or peers who are customarily present
  Feasible financial supports: Materials used in the study are not cost-prohibitive for use in
    typical settings
  Practical personnel supports: Supports provided to personnel are not prohibitive for use in
    typical settings
  Maintenance of intervention: Indigenous agents continue using intervention strategies when
    research support is removed
  Maintenance of behavior change: Behavior change is maintained even when research support is
    removed
  Interviews or questionnaires: Indirect consumers report the intervention can be implemented in
    typical settings

Acceptance/satisfaction
  Participant choice (goals): Research participants choose target goals
  Participant choice (intervention): Research participants choose intervention components given
    those likely to be successful
  Consumer selection (goals): Non-participant consumers select the target goal as important
  Consumer selection (intervention): Non-participant consumers select the intervention as
    acceptable
  Interviews or questionnaires: Indirect consumers report the intervention and/or target behaviors
    are acceptable

Importance/significance
  Interviews or questionnaires: Indirect consumers report the behavior change was important
  Normative comparisons: Behavior change results in behavior consistent with or closer to that
    exhibited by peers
  Blind ratings: Non-participant consumers who cannot ascertain condition (e.g., do not know
    whether they are rating pre- or post-intervention videos) rate target behavior change as
    important (e.g., rate behavior exhibited during or after intervention conditions as more
    desirable or less atypical than baseline behavior)

In a study using single case design, a consistent and replicated intervention effect (existence of
a functional relation) does not necessarily mean conclusions drawn are meaningful to either current
participants (a question of social validity) or to potential future consumers (a question of
ecological validity). We argue that at least three critical questions exist regarding ecological
and social validity: (a) Is the intervention feasible? (b) Are the target behaviors, intervention,
and outcomes acceptable and are consumers satisfied with them? and (c) Does the intervention result
in significant or important change? Social and ecological validity components related to these
three areas (feasibility, acceptability, and significance) are shown in Table 1. There is some
overlap among these terms; for example, a parent might find intervention procedures acceptable only
if the resulting child behavior change is significant. Moreover, social validity assessment may
overlap with the assessment of effects; that is, direct and indirect consumers may be unlikely to
rate a change as significant if a functional relation is not demonstrated for primary measures
(although there is some evidence that this is not necessarily true; Strain, Barton, & Dunlap,
2012).

In previous reviews, researchers have noted insufficient measurement of social validity, and have
called for improved measurement and reporting of the extent to which studies are ecologically and
socially valid (Hurley, 2012; Kennedy, 1992; McDonald & Machalicek, 2013; Odom & Strain, 2002;
Spear, Strickland-Cohen, Romer, & Albin, 2013). Odom and Strain reported that only 15% of
intervention studies for young children with disabilities assessed acceptability and only 27%
assessed the importance of effects. Similarly, Hurley (2012) reported that only 27% of studies
designed to improve social competence for preschoolers with disabilities included measures of
social validity. Clarke and Dunlap (2008) reported variable inclusion of social validity measures
in intervention studies for children and young adults with disabilities across three journals
(3%-31%). They reported higher rates of ecological validity, described in part as the use of
typical physical contexts (19%-63%), activity contexts (24%-69%), or social contexts (22%-69%).
However, Clarke and Dunlap reported relatively uncommon use of family (2%-10%) or teacher (4%-17%)
implementation within their sample of studies. It may be that use of indigenous implementers and
typical contexts is higher in research with young children with autism spectrum disorders (ASD),
because intervening in natural contexts is valued in ECSE/EI, but this assumption has not been
evaluated.

Objective and Subjective Data Collection 

Social validity data are commonly collected using subjective post-intervention measures (Kong &
Carta, 2011; Machalicek et al., 2007); these measures are often used to gather opinions regarding
acceptability and feasibility from consumers following an intervention study. It is also possible
to collect objective data regarding the feasibility, acceptability, satisfaction, and significance
of target behaviors, interventions, and outcomes. One objective method for assessing the social
validity of an intervention is to have blind raters (e.g., those who are unaware of the condition
being implemented) report the extent to which observed behavior is typical, acceptable, or
positive; the extent to which the procedures used are enjoyable, beneficial, or feasible; and the
extent to which behavior changes between two different (but unlabeled) session types are positive,
noticeable, or important. A second procedure for objectively assessing social validity is to
compare outcome data with data from individuals who display age-appropriate or socially acceptable
behaviors (Ennis, Jolivette, Fredrick, & Alberto, 2013). In this case, authors might demonstrate
that behavior of target participants during baseline conditions is considerably different from peer
comparison data, but that this difference is minimal, non-existent, or lessened during or after
intervention. Although the purpose of an intervention is not always to completely ameliorate a
problem to levels expected of comparison peers, these comparisons may inform general expectations
for researchers when assessing meaningful change. Thus, objective measures of social validity
exist, although previous studies have suggested that they are not often used (cf. Kennedy, 1992;
Spear et al., 2013).

Another example of the objective measurement of acceptability is the use of participant choice.
Hanley (2010) argued that it is possible to objectively assess the degree to which an intervention
is acceptable for direct consumers (e.g., children, even those with limited communication skills),
by providing treatment options and assessing choices (e.g., in a simultaneous treatments design;
see Ledford, Wolery, & Gast, 2014). This could occur instead of or in addition to the usual
subjective measures of acceptability typically assessed via adults. Measures derived from
subjective sources are difficult to analyze because of inadequate evidence that they are
psychometrically sound. An additional concern is that consumers may feel an obligation to report
positive results because of perceived researcher preferences. This may be particularly important in
single case research because the relatively small number of participants decreases the likelihood
of anonymity in reporting. Results from these types of social validity assessments have been
described as “indiscriminately positive” (Machalicek et al., 2007).

Although social and ecological validity are often measured using subjective checklists and
interviews, several research teams (cf. Ennis et al., 2013; Kennedy, 1992, 2003; Ledford et al.,
2014) have suggested objective measurement, including (a) normative comparisons, (b) blind ratings,
(c) evidence of maintenance of use by indigenous implementers, (d) evidence of maintenance of
behavior change, (e) use of participant choice, and (f) consideration of consumer preferences. The
analysis of these data, along with assessment of the use of typical settings and indigenous
implementers, may allow consumers to determine to what extent a study shows evidence of social and
ecological validity.

Social Skills Intervention Research for Young Children With ASD 

One body of literature often controversial in regard to acceptability, feasibility, and
significance is the literature assessing the effectiveness of social skills interventions for young
children with ASD. Thus, meaningful measurement of social validity may be especially important in
this area. Although recommended practices for young children include provision of intervention by
indigenous implementers in typical social contexts (Division for Early Childhood of the Council for
Exceptional Children, 2014), some of the most widely used and researched interventions may not meet
these standards (e.g., discrete trial training conducted in a clinic setting by an advanced
researcher). There is a longstanding debate about the usefulness of these interventions that (a)
may be unacceptable to parents and caregivers, (b) may not be feasible for widespread
implementation, and (c) may be used to teach irrelevant skills to children who are in need of
practical skills for social communication in typical contexts. Thus, despite the increased number
of recommended practices for individuals with ASD (cf. Wong et al., 2014), the social validity of
some interventions with research support may be questionable.

Social skills impairments are one of the defining characteristics of ASD, and interventions
designed to increase pro-social behaviors for children with ASD have the potential to have
important and long-lasting impacts on social relationships and other outcomes, such as educational
placement. A variety of interventions exist for increasing pro-social behaviors; reports of the
effectiveness of these interventions, as a whole, have been positive (e.g., Reichow & Volkmar,
2010; Wong et al., 2014). However, even when a practice has been deemed evidence based, important
questions remain. Among these important questions are ones of feasibility, acceptability, and
significance; social skills interventions may only be effective if indigenous implementers can and
will use them and if the behavior change they produce is meaningful. This may be an especially
important consideration in ECSE/EI settings, where children spend the majority of their time with
parents and early childhood practitioners who may not have specific training in evidence-based
practice.

The purpose of this review was to analyze studies designed to increase social competence for young
children with ASD to determine to what extent evidence for social and ecological validity was
present. Research questions were as follows: 

Research Question 1: To what extent do authors report 
evidence of social validity? 

Research Question 2: What types of social validity data 
are reported? 

Research Question 3: To what extent do studies include 
characteristics that show evidence of ecological 
validity? 

Research Question 4: Which type of evidence of ecological validity is most often reported? 

Method 

The studies included in this review were also included in a larger review of social skills
interventions for individuals with ASD (Ledford, King, Harbin, & Zimmerman, 2015). Inclusion
criteria for that review included (a) inclusion of an individual with ASD, (b) use of a single case
design with at least three potential demonstrations of effect and visual data presentation, (c)
assessment of a social skills intervention with a dependent variable related to human-to-human
interactions, and (d) publication in a peer-reviewed journal between 1994 and 2013. An additional
criterion for inclusion in the current review was that all participants included were 8 years of
age or below and at least half were aged 5 years or younger. Coding was conducted separately for
each study, which was defined by the use of a stand-alone single case design; some articles
included multiple studies. For example, an article with three A-B-A-B designs included three
studies with three participants, whereas an article with a single multiple baseline across three
participants design included a single study with three participants.

General coding was done and reported as part of the previously completed review of social skills
interventions; these variables included year of publication, journal, intervention components,
dependent variables, participant age, design type, and whether a functional relation existed. A
doctoral level single case researcher completed general coding, with independent reliability
conducted for 21% of all studies in the larger review by a second doctoral level researcher. Both
coders were also certified behavior analysts. Average agreement was 95.4% across codes. The
presence of a functional relation was separately determined for each study, using a dichotomous
yes/no decision. Determinations regarding functional relations were made using visual analysis,
with consideration for changes in level, trend, and variability across conditions. Additional
information about coding, including specific independent and dependent variable definitions, is
available in the article describing the larger review (Ledford, King, et al., 2015).


Variables specific to the questions of social and ecological validity were coded to answer the
research questions for this review. These codes included whether authors self-reported the
collection of social validity data and what type of data were reported (interview or questionnaire
results, blind raters, normative comparisons, and other). In addition, we coded whether direct and
indirect consumers were provided with opportunities to make decisions regarding independent or
dependent variable selection. Two graduate students assessed social and ecological validity
variables for reviewed articles. For 34% of articles, both students independently coded all
variables and the first author assessed agreement between coders. Across articles and codes, mean
agreement was 96.7%, with a range of 89.2% to 100% across codes. Disagreements were reviewed and
reconciled by the first author.
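
The agreement figures above could be computed in several ways; the article does not state the formula. A minimal Python sketch, assuming simple point-by-point agreement (agreements divided by total codes, multiplied by 100), is shown below. The function name `percent_agreement` and the example codes are hypothetical and are not data from the review.

```python
def percent_agreement(coder_a, coder_b):
    """Point-by-point agreement: (number of matching codes / total codes) x 100."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must supply the same non-zero number of codes.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)


# Hypothetical codes for one variable across five articles (illustration only).
coder_1 = ["yes", "no", "yes", "yes", "no"]
coder_2 = ["yes", "no", "no", "yes", "no"]
print(percent_agreement(coder_1, coder_2))
```

Under this assumption, mean agreement across codes would be the average of the per-code values, and the reported range would be the lowest and highest of them.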

Social Validity Codes 

• Interview or questionnaire results were coded as occurring if authors reported data on ratings of
procedures, objectives, and/or effects produced by children, practitioners, or caregivers directly
or indirectly involved in the intervention. Anecdotal reports were excluded.

• Blind ratings were coded as occurring if (a) raters were blind to the condition type when they
observed (typically via video) and (b) these raters provided judgment ratings on child behavior,
procedures, or outcomes that allowed direct comparisons between conditions. Primary data collection
via blind raters was not included, although blind data collection is beneficial and serves to
decrease risk of bias and increase confidence in results (cf. Barton, Reichow, Schnitz, Smith, &
Sherlock, 2015).

• Normative comparisons were coded if authors used data collected from non-participating children
to either (a) choose intervention criteria or (b) serve as a comparison for data collected from
participating children.

• The provision of choice to direct (e.g., children or implementing adults) or indirect consumers
(e.g., parents, teachers) was coded if researchers reported that these individuals made choices
regarding the intervention to be used or the behaviors to be targeted.

For the three primary types of social validity assessments (interview or questionnaire results,
blind ratings, and normative comparisons), we coded the degree to which results reported were
positive. Because results within and across types were not directly comparable, we coded results as
being only positive, only negative, or mixed. Mixed ratings could occur when results were positive
for some items and negative for others (e.g., satisfaction was rated favorably but feasibility was
rated poorly) or when results were positive for some participants and negative for others. In
addition to type and results, we also coded the domain authors reported: feasibility, acceptance or
satisfaction, and importance or significance. Single or multiple domains could be reported by an
author (e.g., if authors reported degree to which teachers found procedures acceptable and the
extent to which they could use the procedures in the classrooms, acceptance and feasibility were
coded).
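
The three-way results code described above can be expressed as a small decision rule. The Python sketch below is an illustration under the assumption that each reported item or participant result has already been judged positive or not; the function name `code_results` is hypothetical and does not come from the article.

```python
def code_results(item_results):
    """Collapse item- or participant-level results into one study-level code.

    item_results: iterable of booleans, True where a result was positive.
    """
    results = list(item_results)
    if not results:
        raise ValueError("At least one result is required.")
    if all(results):
        return "only positive"
    if not any(results):
        return "only negative"
    return "mixed"


# Hypothetical example: satisfaction rated favorably, feasibility rated poorly.
print(code_results([True, False]))
```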

Ecological Validity Codes 

In addition to authors’ reports of objective and subjective social validity data reported in
articles, we also coded the extent to which authors reported using procedures consistent with
increased ecological validity. Specifically, we recorded:

• Type of implementer and social partner (e.g., teacher, parent, peer, researcher). These codes
could be different within a study (e.g., a researcher implemented an intervention, while measuring
changes in social responses to peer social partners).

• Whether the implementer and social partners were indigenous to the child’s environment.
Implementers and social partners were considered separately (as described above). In general,
teachers, parents, and peers were considered as indigenous and researchers were considered as not
indigenous.

• Whether the settings and activities of the intervention were typical. A setting was considered
typical if the participating child would have been present if he or she were not involved in
research (e.g., homes were considered typical, clinics were considered not typical). An activity
was considered typical if the participating child would have been engaging in the activity if he or
she were not involved in research (e.g., free play was considered typical, being pulled out to
engage in a social skills lesson was considered not typical).

• Whether there was evidence of continued use by typical agents without researcher support.

• Whether there was evidence of maintenance of behavior change when intervention was withdrawn.

Results 

Studies in the review assessed interventions designed to increase pro-social behaviors directed at
a human social partner for young children with ASD; 109 studies in 54 published articles met
inclusion criteria (see the appendix). Studies were defined as stand-alone single case designs; as
such, multiple studies could be included in a published article. Common intervention components
included prompting (n = 43), environmental arrangement (n = 26), social skills training (n = 22),
peer training (n = 21), and the use of responsive interactions (n = 20). Many of the studies were
published in Journal of Positive Behavior Interventions (n = 25), Journal of Autism and
Developmental Disorders (n = 24), or Journal of Applied Behavior Analysis (n = 19; with fewer
studies published in 19 other journals). A variety of designs were represented, all with at least
three potential demonstrations of effect, including multiple baseline designs across participants
(n = 46) or behaviors (n = 18), A-B-A-B designs (n = 16), and alternating treatments designs
(n = 16; with 13 studies using multiple baseline or probe across contexts or settings or multiple
probe across participants). When using a stringent measure of effects based on consistent and
replicated behavior change, the percentage of studies showing evidence of a functional relation was
53%.

Measurement of Social Validity 

In this review, fewer than half of the included studies (n = 48, 44%) reported measurement of
social validity data. The most common type of data collected was post-intervention ratings
completed via interview or questionnaire (n = 34). In contrast, relatively few studies used
objective measures of social validity, including blind ratings (n = 12; e.g., Buffington, Krantz,
McClannahan, & Poulson, 1998; Whalen & Schreibman, 2003) and normative comparisons (n = 9; e.g.,
McGee & Daly, 2007; Zanolli, Daggett, & Adams, 1996). Other measures of social validity included
the extent to which direct consumers, including teachers, parents, and therapists, were given the
opportunity to choose intervention targets (n = 15; cf. Crozier & Tincani, 2007; Ingersoll &
Wainer, 2013; Jones & Feeley, 2007) or intervention procedures (n = 0). Although no study reported
choice of intervention procedures or components, several did report that consumers assisted in the
selection of materials or contexts (cf. Kohler, Greteman, Raschke, & Highnam, 2007; Maione &
Mirenda, 2006).

Measurement of Social Validity Over Time 

To determine changes in measurement over time, the publication period was divided into four
periods (each comprising 5 years). As shown in Figure 1, the percentage of studies reporting social
validity data was relatively stable over time (38% during the first time period, 41% during the
final time period) with the highest percentage of studies reporting data during the 1999 to 2003
time period.
Interestingly, measurement of subjective social validity, 
despite its limitations, increased considerably (5% during 
the first time period, 34% during the final time period). 
Meanwhile, measurement of objective social validity data 
has steadily decreased (31% during the first time period, 
14% during the final time period). 

Figure 1. Percentage of studies over time that measured and reported any type of social validity
data, subjective data, and objective data.
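
The time-period analysis summarized in Figure 1 amounts to binning each study by publication year and computing the percentage of studies in each bin that reported social validity data. The sketch below assumes the four periods are consecutive 5-year blocks starting in 1994 (consistent with the 1994 to 2013 inclusion window and the 1999 to 2003 period named in the text). The function names and example records are hypothetical and are not data from the review.

```python
from collections import defaultdict

FIRST_YEAR, LAST_YEAR, PERIOD_LENGTH = 1994, 2013, 5


def period_for(year):
    """Return the 5-year publication period label (e.g., '1999-2003') for a year."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"{year} is outside the {FIRST_YEAR}-{LAST_YEAR} review window.")
    start = FIRST_YEAR + ((year - FIRST_YEAR) // PERIOD_LENGTH) * PERIOD_LENGTH
    return f"{start}-{start + PERIOD_LENGTH - 1}"


def percent_reporting_by_period(studies):
    """studies: iterable of (publication_year, reported_social_validity) pairs."""
    totals, reporting = defaultdict(int), defaultdict(int)
    for year, reported in studies:
        period = period_for(year)
        totals[period] += 1
        reporting[period] += bool(reported)
    return {p: 100.0 * reporting[p] / totals[p] for p in sorted(totals)}


# Hypothetical records for illustration only.
example = [(1995, True), (1997, False), (2001, True), (2012, False)]
print(percent_reporting_by_period(example))
```

The same function could be run separately on subjective and objective measures to produce the three series plotted in the figure.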

Measurement of Feasibility, Acceptability, and Significance 

Almost all of the studies using subjective social validity measurement reported measurement of
acceptability or satisfaction (n = 32); fewer reported measuring feasibility (n = 15) or
significance (n = 15). Objective social validity measurement was most often reported to assess
significance or importance (n = 12), and less often measured acceptance or satisfaction (n = 5) or
feasibility (n = 1).

Social Validity and Primary Study Results 

The percentage of studies demonstrating a functional relation for primary data was approximately
the same for studies reporting social validity data (54%) as for the group of studies as a whole
(53%). Approximately 80% of studies reporting the use of subjective questionnaires reported only
positive results, whereas 20% reported mixed results. Results of objective social validity
assessments (using blind ratings or normative comparisons) were similar to each other but quite
different from those reported regarding subjective assessments. Fewer than half of the studies
using blind ratings and normative comparisons (42% and 44%, respectively) reported only positive
results, with the remainder of studies reporting mixed results. No study reported entirely negative
social validity results.

When comparing social validity results to primary data conclusions (e.g., functional relation
present or absent), data were also discrepant. For studies with positive subjective social validity
data, 50% did not provide evidence of a functional relation. However, 73% of studies with positive
objective social validity results also had positive results regarding primary data (functional
relation). This suggests positive objective social validity results are more likely to occur when
primary effects are present; subjective social validity results are not similarly associated with
primary results.

Ecological Validity 

Approximately half of the studies reported evidence of common indicators of ecological validity,
including indigenous implementers (n = 49, 45%), typical social partners (n = 61, 56%), or typical
settings (n = 62, 57%). Many of these studies (n = 42, 38%) reported evidence of all three
indicators. Fewer studies (n = 30, 28%) reported that the intervention occurred in the context of
typically occurring activities (e.g., the study was conducted in a typical environment like an
early childhood program, and also during regular ongoing activities in that setting). Very few
studies reported whether indigenous implementers continued using the intervention after the
treatment condition was completed (n = 8, 7%).

Discussion 

The purpose of this review was to determine the extent to which studies designed to increase
pro-social behaviors for young children with ASD reported evidence of social validity and
ecological validity. Results suggest that (a) about half of the studies report assessments of
social validity; (b) about half of the studies report the use of indigenous implementers, typical
social partners, or typical settings and more than a third report the use of all three; (c)
measurement of social validity has been fairly stable over time, although objective measurement has
decreased and subjective measurement has increased; (d) studies using subjective measures most
often reported assessment of acceptability/satisfaction; (e) studies using objective measures most
often reported assessment of importance/significance; and (f) positive results from objective
measures, as compared with subjective measures, were more often associated with the presence of a
functional relation for primary outcomes.

Limitations 

When interpreting conclusions from this review, several limitations should be considered,
including (a) the exclusion of studies using group comparison designs, (b) the exclusion of studies
that did not permit the evaluation of a functional relation (e.g., multiple baseline designs with
two tiers, A-B-A-B designs without three data points in each condition), and (c) the restricted
focus on participants with ASD. Despite these limitations and the need for future research in the
area, some conclusions can be drawn regarding the social and ecological validity of interventions
for young children with ASD.

Conclusion 

Although not directly comparable, the results from this 
review are somewhat discrepant from previous reports, 
which have found less frequent measurement of social 
validity in journals specific to behavior analysis (Carr, 
Austin, Britton, Kellum, & Bailey, 1999; Kennedy, 1992) 
and early intervention (Hurley, 2012). As part of the larger 
review, studies were excluded if determination of a func¬ 
tional relation was not possible (e.g., fewer than three data 
points per condition or fewer than three potential demon¬ 
strations of effect did not exist; Ledford, King, et al., 2015). 
This may have had the effect of excluding studies that were 
lower in quality; there may be a relationship between meth¬ 
odological rigor and social and ecological validity. Evidence 
for ecological validity was higher than in a previous review 
(Clarke & Dunlap, 2008), although it approximated results 
from one journal in that review with a specific aim related 
to ecological validity (Journal of Positive Behavior 
Interventions). Despite equal or better ecological and social 
validity than some reported reviews, the rate of positive 
outcomes (percentage of studies with a functional relation, 
53%) is rather low in comparison with both the other
reviews and the larger review of social skills interventions
for individuals of all ages (Ledford, King, et al., 2015).

The discrepant findings across social validity type sug¬ 
gest subjective and objective measures of social validity 
may focus on two different constructs, with subjective mea¬ 
surement often associated with satisfaction ratings and 
objective measures associated with assessments of signifi¬ 
cance. Most often, satisfaction ratings were researcher- 
designed, without evidence of adequate psychometric 
properties; several studies did report using established measures
(e.g., Intervention Rating Profile-15 [IRP-15];
Martens, Witt, Elliott, & Darveaux, 1985). Given the ques¬ 
tionable validity of researcher-developed measures, 
researchers should consider either using established mea¬ 
sures or using objective measures of social validity. 
Researchers should determine whether a combination of 
subjective and objective measures best captures measure¬ 
ment of the three components of social validity first 
described by Wolf (1978). For example, the appropriate¬ 
ness of procedures could be assessed by asking parents to 
select an intervention from a pool of effective strategies, in 
addition to asking if a prescribed strategy was acceptable on 
completion of the intervention. It is likely that multiple 
assessments of social validity are needed to adequately 
assess all components named by Wolf. Across studies, some 
components may be more important than others, based on 
specific research questions and goals. 

Although we have suggested that objective measurement 
of social validity data may have considerable benefits, the 
use of anecdotal and subjective assessments may also be 
beneficial. For example, anecdotal reports related to results
provide insight into ways procedures can be modified to
improve the extent to which practitioners can effectively 
use them in practice. In one study (Ledford, Lane, Shepley, 
& Kroll, 2015), a practitioner requested that a procedure
involving randomly ordering materials be replaced with a 
different method. This modification was made, despite a 
small loss of contextual control. Post hoc narrative reports 
from practitioners can also be helpful, perhaps more helpful 
than the usual Likert-type scale rating. For example, data 
from this and other reviews (cf. Machalicek et al., 2007) 
suggest subjective ratings are likely to be positive, regard¬ 
less of the effectiveness of procedures. Instead of asking for 
simple ratings, more insight might be gained by asking 
practitioners questions such as “What would you change if 
you used this intervention?” “What child behavior would 
you target if you used this intervention?” or “What inter¬ 
vention would be more feasible for you to use?” It is possi¬ 
ble that researchers see Likert-type rating scales as an 
effective way to transform subjective ratings into quantita¬ 
tive data; we suggest, in light of these data, that the transfor¬ 
mation may result in invalid and undifferentiated outcomes. 
Some subjective ratings might be helpful and less prone to 
bias—for example, one method for assessing the appropri¬ 
ateness of an intervention, as well as feasibility and accept¬ 
ability, is to ask parents or practitioners to provide ratings 
regarding multiple interventions that could be used to 
increase a pro-social behavior (e.g., Which of the described 
procedures would be easiest for you to use?). Selecting the 
intervention most highly ranked by the consumer could 
enhance the usefulness of this method. Collecting this infor¬ 
mation would allow researchers to rank-order interventions 
by consumer or a proxy of the consumer. Moreover, this 
option would require considerably fewer resources than 
having blind raters view recordings of different types of 
interventions for essentially the same purpose—assessing 
acceptability of an intervention. Thus, decreasing the likeli¬ 
hood of bias may be possible with subjective measures, 
although this possibility should be evaluated. This is a criti¬ 
cal line of research, especially given that the increase of 
subjective measures in comparison with objective measures 
is likely at least partially explained by differential resource 
requirements. 
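
The rank-ordering approach described above could be summarized in a few lines of code. The following is a minimal sketch, not a procedure used in any reviewed study: it assumes each consumer supplies a complete ordering of the same set of candidate interventions, and it uses the mean rank position as the summary statistic. The function name, respondent labels, and intervention names are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def rank_order_interventions(rankings):
    """Order interventions by mean rank across consumers.

    rankings: dict mapping a consumer label to a list of interventions,
    ordered from most preferred (position 1) to least preferred.
    Returns a list of (intervention, mean_rank) tuples, best first.
    """
    positions = defaultdict(list)
    for ordered_choices in rankings.values():
        for position, intervention in enumerate(ordered_choices, start=1):
            positions[intervention].append(position)

    mean_ranks = {name: sum(p) / len(p) for name, p in positions.items()}
    # Lower mean rank = more preferred; ties are broken alphabetically.
    return sorted(mean_ranks.items(), key=lambda item: (item[1], item[0]))


# Hypothetical data: three consumers rank three candidate interventions.
hypothetical_rankings = {
    "parent_1": ["peer modeling", "video modeling", "prompting"],
    "parent_2": ["video modeling", "peer modeling", "prompting"],
    "teacher_1": ["peer modeling", "prompting", "video modeling"],
}

ordered = rank_order_interventions(hypothetical_rankings)
for name, mean_rank in ordered:
    print(f"{name}: mean rank {mean_rank:.2f}")
```

Mean rank is only one possible summary. It treats rank positions as equally spaced and weights every respondent equally, which may not hold when direct and indirect consumers disagree; reporting each consumer's ordering alongside the summary would preserve that information.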

Additional research is needed to determine (a) the extent 
to which subjective measures may be biased under different 
conditions (e.g., data collected by researcher vs. unrelated 
personnel, data collected from direct vs. indirect consum¬ 
ers, data collected from validated vs. researcher-developed 
instruments), (b) the extent to which data collected from 
objective and subjective measures of social validity result 
in similar judgments (e.g., may be measuring similar or dif¬ 
ferent constructs), and (c) the possibility that social validity 
results that are entirely negative rather than positive or
mixed (no entirely negative results were reported in studies in this
review) are not shared even when corresponding primary
data are published, potentially resulting in publication bias
(for a discussion regarding this problem, see Schwartz & 
Baer, 1991). 

Based on current evidence, objective and valid measures 
of social and ecological validity are suggested for use in 
concert with anecdotal and narrative reports from direct and 
indirect consumers. Moreover, we suggest researchers (a) 
report specifically which facets of social validity are 
assessed by their measures (e.g., feasibility, acceptability, 
significance); (b) explicitly explain difficulties with inter¬ 
preting subjective measures, including the possibility that 
implementers may report positive results due to perceived
wishes of the researchers; and (c) explicitly describe fea¬ 
tures of the study and social validity results that might 
enhance or impede use of a procedure in typical settings. 